Analyze the data set to summarize the main characteristics of its variables, often with visual graphs and without using a statistical model.
Understand the dimensions of the data set, the variable names, the overall missing-value summary, and the data type of each variable.
# Overview of the data
ExpData(data=data,type=1)
# Structure of the data
ExpData(data=data,type=2)
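The same kind of overview can be approximated in base R. This is only a minimal sketch, not the output of `ExpData`: it assumes the built-in `mtcars` data set as a stand-in, because the source does not show how `data` is loaded, and the names `dims` and `structure_tbl` are hypothetical.

```r
# Stand-in data (assumption): mtcars ships with base R and has an mpg column.
data <- mtcars

# Dimensions: number of rows (observations) and columns (variables).
dims <- dim(data)

# Per-variable structure: name, type, and missing-value count and share.
structure_tbl <- data.frame(
  variable    = names(data),
  type        = vapply(data, function(x) class(x)[1], character(1)),
  n_missing   = vapply(data, function(x) sum(is.na(x)), integer(1)),
  pct_missing = vapply(data, function(x) mean(is.na(x)), numeric(1)),
  row.names   = NULL
)

print(dims)
print(structure_tbl)
```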
Overview of the data
Target variable
Summary of continuous dependent variable
Summary statistics when the dependent variable, mpg, is continuous.
ExpNumStat(data,by="A",gp=Target,Qnt=seq(0,1,0.1),MesofShape=2,Outlier=TRUE,round=2)
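To make the `Qnt=seq(0,1,0.1)` and `Outlier=TRUE` arguments more concrete, the sketch below computes deciles and an outlier count for a single variable in base R. It assumes `mtcars` as a stand-in data set and the common 1.5 × IQR rule for flagging outliers; the source does not state which rule `ExpNumStat` applies, so treat the thresholds as an illustration only.

```r
# Stand-in variable (assumption): mpg from the built-in mtcars data set.
x <- mtcars$mpg

# Deciles: the 0%, 10%, ..., 100% quantiles, matching seq(0, 1, 0.1).
deciles <- quantile(x, probs = seq(0, 1, 0.1))

# Outlier fences under the 1.5 * IQR rule (an assumed rule, see lead-in).
q     <- unname(quantile(x, probs = c(0.25, 0.75)))
iqr   <- q[2] - q[1]
lower <- q[1] - 1.5 * iqr
upper <- q[2] + 1.5 * iqr

# Number of observations falling outside the fences.
n_outliers <- sum(x < lower | x > upper)

print(round(deciles, 2))
print(c(lower = lower, upper = upper, n_outliers = n_outliers))
```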
Graphical representation of all numeric features. The following types of plots are used to explore the data.
Quantile-quantile plot for all numeric variables
ExpOutQQ(data,nlim=4,fname=NULL,Page=c(2,2),sample=sn)
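For readers who want to see what a single quantile-quantile panel contains, here is a base R sketch for one variable. It assumes `mtcars$mpg` as stand-in data and a normal reference distribution; it does not reproduce the layout that `ExpOutQQ` produces.

```r
# Stand-in variable (assumption): mpg from the built-in mtcars data set.
x <- mtcars$mpg

# Theoretical normal quantiles (qq$x) against sample quantiles (qq$y),
# computed without drawing so the values can be inspected first.
qq <- qqnorm(x, plot.it = FALSE)

# Draw the points and add the reference line through the quartiles.
plot(qq$x, qq$y,
     xlab = "Theoretical quantiles",
     ylab = "Sample quantiles",
     main = "Normal Q-Q plot: mpg")
qqline(x)
```

Points that bend away from the reference line at either end indicate heavier or lighter tails than a normal distribution.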
Density plot for all numerical variables
ExpNumViz(data,target=NULL,nlim=10,fname=NULL,col=NULL,theme=theme,Page=c(2,2),sample=sn)
Scatter plot between all numeric variables and the target variable mpg. This plot helps to examine how well the target variable is correlated with the independent variables in the data set.
ExpNumViz(data,target=NULL,nlim=5,Page=c(2,1),theme=theme,sample=sn,scatter=TRUE)
Dependent variable is mpg (continuous).
ExpNumViz(data,target=Target,nlim=5,fname=NULL,col=NULL,theme=theme,Page=c(2,2),sample=sn)
Correlation summary table
ExpNumStat(data,by="GA",gp=Target,MesofShape=2,Outlier=FALSE,round=2,dcast=TRUE,val="cor")
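A correlation summary of this kind can be cross-checked with base R. The sketch below assumes `mtcars` as a stand-in data set with `mpg` as the target and uses Pearson correlation, which is the default of `cor()`; the names `target`, `predictors` and `cor_tbl` are hypothetical, and the table layout differs from what `ExpNumStat` returns.

```r
# Stand-in data and target (assumptions): mtcars with mpg as the target.
data   <- mtcars
target <- "mpg"

# Keep the numeric columns and drop the target itself.
numeric_cols <- names(data)[vapply(data, is.numeric, logical(1))]
predictors   <- setdiff(numeric_cols, target)

# Pearson correlation of each numeric variable with the target.
cor_tbl <- data.frame(
  variable = predictors,
  cor      = round(vapply(predictors,
                          function(v) cor(data[[v]], data[[target]]),
                          numeric(1)), 2),
  row.names = NULL
)

# Strongest relationships first, regardless of sign.
cor_tbl <- cor_tbl[order(-abs(cor_tbl$cor)), ]
print(cor_tbl)
```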
Summary of categorical variables
ExpCTable(data,margin=1,clim=10,nlim=5,round=2,per=TRUE)
## bin=4: the continuous target is discretized into 4 categories based on quantiles
ExpCTable(data,Target=Target,margin=1,clim=10,nlim=5,round=2,bin=4,per=TRUE)
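The quantile-based binning behind `bin=4` can be illustrated in base R. This sketch assumes `mtcars` as stand-in data, `mpg` as the continuous target, and `cyl` as an example categorical variable; it shows one way to cut a variable at its quartiles and cross-tabulate with row percentages, and is not a reproduction of `ExpCTable` output.

```r
# Stand-in data (assumption): mtcars, with mpg as the continuous target.
data <- mtcars

# Quartile break points: 0%, 25%, 50%, 75%, 100%.
breaks <- quantile(data$mpg, probs = seq(0, 1, length.out = 5))

# Discretize the target into 4 bins; include.lowest keeps the minimum value.
mpg_bin <- cut(data$mpg, breaks = breaks, include.lowest = TRUE)

# Cross-tabulate an example categorical variable (cyl) against the binned target.
tab <- table(cyl = data$cyl, mpg_bin = mpg_bin)

# Row percentages: each row sums to roughly 100 after rounding.
row_pct <- round(100 * prop.table(tab, margin = 1), 2)

print(tab)
print(row_pct)
```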
Graphical representation of all categorical variables
Bar plot with vertical or horizontal bars for all categorical variables
ExpCatViz(data,clim=10,margin=2,theme=theme,Page = c(2,2),sample=sc)